- <?xml version='1.0' encoding='UTF-8' ?>
- <!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd">
- <?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?>
- <!-- $Revision: 1.1.2.15 $ -->
-
- <!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
- -->
-
- <manualpage metafile="content-negotiation.xml.meta">
-
- <title>Content Negotiation</title>
-
- <summary>
-
- <p>Apache supports content negotiation as described in
- the HTTP/1.1 specification. It can choose the best
- representation of a resource based on the browser-supplied
- preferences for media type, languages, character set and
- encoding. It also implements a couple of features to give
- more intelligent handling of requests from browsers that send
- incomplete negotiation information.</p>
-
- <p>Content negotiation is provided by the
- <module>mod_negotiation</module> module, which is compiled in
- by default.</p>
- </summary>
-
- <section id="about"><title>About Content Negotiation</title>
-
- <p>A resource may be available in several different
- representations. For example, it might be available in
- different languages or different media types, or a combination.
- One way of selecting the most appropriate choice is to give the
- user an index page, and let them select. However, it is often
- possible for the server to choose automatically. This works
- because browsers can send, as part of each request, information
- about what representations they prefer. For example, a browser
- could indicate that it would like to see information in French,
- if possible, else English will do. Browsers indicate their
- preferences by headers in the request. To request only French
- representations, the browser would send</p>
-
- <example>Accept-Language: fr</example>
-
- <p>Note that this preference will only be applied when there is
- a choice of representations and they vary by language.</p>
-
- <p>As an example of a more complex request, this browser has
- been configured to accept French and English, but prefer
- French, and to accept various media types, preferring HTML over
- plain text or other text types, and preferring GIF or JPEG over
- other media types, but also allowing any other media type as a
- last resort:</p>
-
- <example>
- Accept-Language: fr; q=1.0, en; q=0.5<br />
- Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1
- </example>
-
- <p>Apache supports 'server driven' content negotiation, as
- defined in the HTTP/1.1 specification. It fully supports the
- <code>Accept</code>, <code>Accept-Language</code>,
- <code>Accept-Charset</code> and <code>Accept-Encoding</code>
- request headers. Apache also supports 'transparent'
- content negotiation, which is an experimental negotiation
- protocol defined in RFC 2295 and RFC 2296. It does not offer
- support for 'feature negotiation' as defined in these RFCs.</p>
-
- <p>A <strong>resource</strong> is a conceptual entity
- identified by a URI (RFC 2396). An HTTP server like Apache
- provides access to <strong>representations</strong> of the
- resource(s) within its namespace, with each representation in
- the form of a sequence of bytes with a defined media type,
- character set, encoding, etc. Each resource may be associated
- with zero, one, or more than one representation at any given
- time. If multiple representations are available, the resource
- is referred to as <strong>negotiable</strong> and each of its
- representations is termed a <strong>variant</strong>. The ways
- in which the variants for a negotiable resource vary are called
- the <strong>dimensions</strong> of negotiation.</p>
- </section>
-
- <section id="negotiation"><title>Negotiation in Apache</title>
-
- <p>In order to negotiate a resource, the server needs to be
- given information about each of the variants. This is done in
- one of two ways:</p>
-
- <ul>
- <li>Using a type map (<em>i.e.</em>, a <code>*.var</code>
- file) which names the files containing the variants
- explicitly, or</li>
-
- <li>Using a 'MultiViews' search, where the server does an
- implicit filename pattern match and chooses from among the
- results.</li>
- </ul>
-
- <section id="type-map"><title>Using a type-map file</title>
-
- <p>A type map is a document which is associated with the
- handler named <code>type-map</code> (or, for
- backwards-compatibility with older Apache configurations, the
- MIME type <code>application/x-type-map</code>). Note that to
- use this feature, you must have a handler set in the
- configuration that defines a file suffix as
- <code>type-map</code>; this is best done with</p>
-
- <example>AddHandler type-map .var</example>
-
- <p>in the server configuration file.</p>
-
- <p>Type map files should have the same name as the resource
- which they are describing, and have an entry for each available
- variant; these entries consist of contiguous HTTP-format header
- lines. Entries for different variants are separated by blank
- lines. Blank lines are illegal within an entry. It is
- conventional to begin a map file with an entry for the combined
- entity as a whole (although this is not required, and if
- present will be ignored). An example map file is shown below.
- This file would be named <code>foo.var</code>, as it describes
- a resource named <code>foo</code>.</p>
-
- <example>
- URI: foo<br />
- <br />
- URI: foo.en.html<br />
- Content-type: text/html<br />
- Content-language: en<br />
- <br />
- URI: foo.fr.de.html<br />
- Content-type: text/html;charset=iso-8859-2<br />
- Content-language: fr, de<br />
- </example>
- <p>Note also that a typemap file will take precedence over the
- filename's extension, even when MultiViews is on. If the
- variants have different source qualities, that may be indicated
- by the "qs" parameter to the media type, as in this picture
- (available as JPEG, GIF, or ASCII-art): </p>
-
- <example>
- URI: foo<br />
- <br />
- URI: foo.jpeg<br />
- Content-type: image/jpeg; qs=0.8<br />
- <br />
- URI: foo.gif<br />
- Content-type: image/gif; qs=0.5<br />
- <br />
- URI: foo.txt<br />
- Content-type: text/plain; qs=0.01<br />
- </example>
-
- <p>qs values can vary in the range 0.000 to 1.000. Note that
- any variant with a qs value of 0.000 will never be chosen.
- Variants with no 'qs' parameter value are given a qs factor of
- 1.0. The qs parameter indicates the relative 'quality' of this
- variant compared to the other available variants, independent
- of the client's capabilities. For example, a JPEG file is
- usually of higher source quality than an ASCII file if it is
- attempting to represent a photograph. However, if the resource
- being represented is an original ASCII art, then an ASCII
- representation would have a higher source quality than a JPEG
- representation. A qs value is therefore specific to a given
- variant depending on the nature of the resource it
- represents.</p>
-
- <p>The full list of headers recognized is available in the <a
- href="mod/mod_negotiation.html#typemaps">mod_negotiation
- typemap</a> documentation.</p>
- </section>
-
- <section id="multiviews"><title>Multiviews</title>
-
- <p><code>MultiViews</code> is a per-directory option, meaning it
- can be set with an <directive module="core">Options</directive>
- directive within a <directive module="core"
- type="section">Directory</directive>, <directive module="core"
- type="section">Location</directive> or <directive module="core"
- type="section">Files</directive> section in
- <code>httpd.conf</code>, or (if <directive
- module="core">AllowOverride</directive> is properly set) in
- <code>.htaccess</code> files. Note that <code>Options All</code>
- does not set <code>MultiViews</code>; you have to ask for it by
- name.</p>
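-
- <p>As a minimal sketch, <code>MultiViews</code> could be enabled
- for a single directory as follows. The path shown is a
- placeholder chosen for illustration, not one taken from this
- document:</p>
-
- <example>
- &lt;Directory /usr/local/apache2/htdocs/docs&gt;<br />
- Options +MultiViews<br />
- &lt;/Directory&gt;
- </example>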
-
- <p>The effect of <code>MultiViews</code> is as follows: if the
- server receives a request for <code>/some/dir/foo</code>, if
- <code>/some/dir</code> has <code>MultiViews</code> enabled, and
- <code>/some/dir/foo</code> does <em>not</em> exist, then the
- server reads the directory looking for files named <code>foo.*</code>, and
- effectively fakes up a type map which names all those files,
- assigning them the same media types and content-encodings it
- would have if the client had asked for one of them by name. It
- then chooses the best match to the client's requirements.</p>
-
- <p><code>MultiViews</code> may also apply to searches for the file
- named by the <directive
- module="mod_dir">DirectoryIndex</directive> directive, if the
- server is trying to index a directory. If the configuration files
- specify</p>
- <example>DirectoryIndex index</example>
- <p>then the server will arbitrate between <code>index.html</code>
- and <code>index.html3</code> if both are present. If neither
- is present, and <code>index.cgi</code> is there, the server
- will run it.</p>
-
- <p>If one of the files found when reading the directory does not
- have an extension recognized by <code>mod_mime</code> to designate
- its Charset, Content-Type, Language, or Encoding, then the result
- depends on the setting of the <directive
- module="mod_mime">MultiViewsMatch</directive> directive. This
- directive determines whether handlers, filters, and other
- extension types can participate in MultiViews negotiation.</p>
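-
- <p>For instance, a setting along the following lines would let
- files that carry handler or filter extensions take part in the
- search. This is an illustrative sketch only; which value is
- appropriate depends on the site:</p>
-
- <example>MultiViewsMatch Handlers Filters</example>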
- </section>
- </section>
-
- <section id="methods"><title>The Negotiation Methods</title>
-
- <p>After Apache has obtained a list of the variants for a given
- resource, either from a type-map file or from the filenames in
- the directory, it invokes one of two methods to decide on the
- 'best' variant to return, if any. It is not necessary to know
- any of the details of how negotiation actually takes place in
- order to use Apache's content negotiation features. However, the
- rest of this document explains the methods used for those
- interested. </p>
-
- <p>There are two negotiation methods:</p>
-
- <ol>
- <li><strong>Server driven negotiation with the Apache
- algorithm</strong> is used in the normal case. The Apache
- algorithm is explained in more detail below. When this
- algorithm is used, Apache can sometimes 'fiddle' the quality
- factor of a particular dimension to achieve a better result.
- The ways Apache can fiddle quality factors are explained in
- more detail below.</li>
-
- <li><strong>Transparent content negotiation</strong> is used
- when the browser specifically requests this through the
- mechanism defined in RFC 2295. This negotiation method gives
- the browser full control over deciding on the 'best' variant;
- the result therefore depends on the specific algorithms
- used by the browser. As part of the transparent negotiation
- process, the browser can ask Apache to run the 'remote
- variant selection algorithm' defined in RFC 2296.</li>
- </ol>
-
- <section id="dimensions"><title>Dimensions of Negotiation</title>
-
- <table>
- <columnspec><column width=".15"/><column width=".85"/></columnspec>
- <tr valign="top">
- <th>Dimension</th>
-
- <th>Notes</th>
- </tr>
-
- <tr valign="top">
- <td>Media Type</td>
-
- <td>Browser indicates preferences with the <code>Accept</code>
- header field. Each item can have an associated quality factor.
- Variant description can also have a quality factor (the "qs"
- parameter).</td>
- </tr>
-
- <tr valign="top">
- <td>Language</td>
-
- <td>Browser indicates preferences with the
- <code>Accept-Language</code> header field. Each item can have
- a quality factor. Variants can be associated with none, one or
- more than one language.</td>
- </tr>
-
- <tr valign="top">
- <td>Encoding</td>
-
- <td>Browser indicates preference with the
- <code>Accept-Encoding</code> header field. Each item can have
- a quality factor.</td>
- </tr>
-
- <tr valign="top">
- <td>Charset</td>
-
- <td>Browser indicates preference with the
- <code>Accept-Charset</code> header field. Each item can have a
- quality factor. Variants can indicate a charset as a parameter
- of the media type.</td>
- </tr>
- </table>
- </section>
-
- <section id="algorithm"><title>Apache Negotiation Algorithm</title>
-
- <p>Apache can use the following algorithm to select the 'best'
- variant (if any) to return to the browser. This algorithm is
- not further configurable. It operates as follows:</p>
-
- <ol>
- <li>First, for each dimension of the negotiation, check the
- appropriate <em>Accept*</em> header field and assign a
- quality to each variant. If the <em>Accept*</em> header for
- any dimension implies that this variant is not acceptable,
- eliminate it. If no variants remain, go to step 4.</li>
-
- <li>
- Select the 'best' variant by a process of elimination. Each
- of the following tests is applied in order. Any variants
- not selected at each test are eliminated. After each test,
- if only one variant remains, select it as the best match
- and proceed to step 3. If more than one variant remains,
- move on to the next test.
-
- <ol>
- <li>Multiply the quality factor from the <code>Accept</code>
- header with the quality-of-source factor for this variant's
- media type, and select the variants with the highest
- value.</li>
-
- <li>Select the variants with the highest language quality
- factor.</li>
-
- <li>Select the variants with the best language match,
- using either the order of languages in the
- <code>Accept-Language</code> header (if present), or else
- the order of languages in the <code>LanguagePriority</code>
- directive (if present).</li>
-
- <li>Select the variants with the highest 'level' media
- parameter (used to give the version of text/html media
- types).</li>
-
- <li>Select variants with the best charset media
- parameters, as given on the <code>Accept-Charset</code>
- header line. Charset ISO-8859-1 is acceptable unless
- explicitly excluded. Variants with a <code>text/*</code>
- media type but not explicitly associated with a particular
- charset are assumed to be in ISO-8859-1.</li>
-
- <li>Select those variants which have associated charset
- media parameters that are <em>not</em> ISO-8859-1. If
- there are no such variants, select all variants
- instead.</li>
-
- <li>Select the variants with the best encoding. If there
- are variants with an encoding that is acceptable to the
- user-agent, select only these variants. Otherwise, if
- there is a mix of encoded and non-encoded variants,
- select only the unencoded variants. If either all
- variants are encoded or all variants are not encoded,
- select all variants.</li>
-
- <li>Select the variants with the smallest content
- length.</li>
-
- <li>Select the first variant of those remaining. This
- will be either the first listed in the type-map file, or
- when variants are read from the directory, the one whose
- file name comes first when sorted using ASCII code
- order.</li>
- </ol>
- </li>
-
- <li>The algorithm has now selected one 'best' variant, so
- return it as the response. The HTTP response header
- <code>Vary</code> is set to indicate the dimensions of
- negotiation (browsers and caches can use this information when
- caching the resource). End.</li>
-
- <li>To get here means no variant was selected (because none
- are acceptable to the browser). Return a 406 status (meaning
- "No acceptable representation") with a response body
- consisting of an HTML document listing the available
- variants. Also set the HTTP <code>Vary</code> header to
- indicate the dimensions of variance.</li>
- </ol>
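-
- <p>As a worked illustration of the first test in step 2, take the
- <code>Accept</code> header from the complex request shown earlier
- and the picture type map with its "qs" values. The products
- below are computed by hand from those two examples; they are not
- server output:</p>
-
- <example>
- foo.jpeg: q=0.6 (image/jpeg) x qs=0.8 = 0.48<br />
- foo.gif: q=0.6 (image/gif) x qs=0.5 = 0.30<br />
- foo.txt: q=0.8 (text/*) x qs=0.01 = 0.008
- </example>
-
- <p>Here <code>foo.jpeg</code> has the highest product, so it
- alone survives the first test and is selected.</p>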
- </section>
- </section>
-
- <section id="better"><title>Fiddling with Quality
- Values</title>
-
- <p>Apache sometimes changes the quality values from what would
- be expected by a strict interpretation of the Apache
- negotiation algorithm above. This is to get a better result
- from the algorithm for browsers which do not send full or
- accurate information. Some of the most popular browsers send
- <code>Accept</code> header information which would otherwise
- result in the selection of the wrong variant in many cases. If a
- browser sends full and correct information these fiddles will not
- be applied.</p>
-
- <section id="wildcards"><title>Media Types and Wildcards</title>
-
- <p>The <code>Accept:</code> request header indicates preferences
- for media types. It can also include 'wildcard' media types, such
- as "image/*" or "*/*" where the * matches any string. So a request
- including:</p>
-
- <example>Accept: image/*, */*</example>
-
- <p>would indicate that any type starting "image/" is acceptable,
- as is any other type.
- Some browsers routinely send wildcards in addition to explicit
- types they can handle. For example:</p>
-
- <example>
- Accept: text/html, text/plain, image/gif, image/jpeg, */*
- </example>
- <p>The intention of this is to indicate that the explicitly listed
- types are preferred, but if a different representation is
- available, that is ok too. Using explicit quality values,
- what the browser really wants is something like:</p>
- <example>
- Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01
- </example>
- <p>The explicit types have no quality factor, so they default to a
- preference of 1.0 (the highest). The wildcard */* is given a
- low preference of 0.01, so other types will only be returned if
- no variant matches an explicitly listed type.</p>
-
- <p>If the <code>Accept:</code> header contains <em>no</em> q
- factors at all, Apache sets the q value of "*/*", if present, to
- 0.01 to emulate the desired behavior. It also sets the q value of
- wildcards of the format "type/*" to 0.02 (so these are preferred
- over matches against "*/*"). If any media type on the
- <code>Accept:</code> header contains a q factor, these special
- values are <em>not</em> applied, so requests from browsers which
- send the explicit information to start with work as expected.</p>
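-
- <p>To illustrate the rule, a header with no q factors such as</p>
-
- <example>Accept: text/html, image/*, */*</example>
-
- <p>would, under the values described above, be treated as if the
- browser had sent</p>
-
- <example>Accept: text/html, image/*; q=0.02, */*; q=0.01</example>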
- </section>
-
- <section id="exceptions"><title>Language Negotiation Exceptions</title>
-
- <p>New in Apache 2.0, some exceptions have been added to the
- negotiation algorithm to allow graceful fallback when language
- negotiation fails to find a match.</p>
-
- <p>When a client requests a page on your server, but the server
- cannot find a single page that matches the
- <code>Accept-language</code> sent by
- the browser, the server will return either a "No Acceptable
- Variant" or "Multiple Choices" response to the client. To avoid
- these error messages, it is possible to configure Apache to ignore
- the <code>Accept-language</code> in these cases and provide a
- document that does not explicitly match the client's request. The
- <directive
- module="mod_negotiation">ForceLanguagePriority</directive>
- directive can be used to override one or both of these error
- messages and substitute the server's judgement in the form of the
- <directive module="mod_negotiation">LanguagePriority</directive>
- directive.</p>
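-
- <p>A configuration of the following shape would cover both
- cases. The language order is only an example and should reflect
- the languages actually offered by the site:</p>
-
- <example>
- LanguagePriority en fr de<br />
- ForceLanguagePriority Prefer Fallback
- </example>
-
- <p>With <code>Prefer</code>, the <directive
- module="mod_negotiation">LanguagePriority</directive> order is
- used to pick one variant instead of returning a "Multiple
- Choices" response; with <code>Fallback</code>, it is used to pick
- one instead of returning a "No Acceptable Variant" response.</p>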
-
- <p>The server will also attempt to match language-subsets when no
- other match can be found. For example, if a client requests
- documents with the language <code>en-GB</code> for British
- English, the server is not normally allowed by the HTTP/1.1
- standard to match that against a document that is marked as simply
- <code>en</code>. (Note that it is almost surely a configuration
- error to include <code>en-GB</code> and not <code>en</code> in the
- <code>Accept-Language</code> header, since it is very unlikely
- that a reader understands British English, but doesn't understand
- English in general. Unfortunately, many current clients have
- default configurations that resemble this.) However, if no other
- language match is possible and the server is about to return a "No
- Acceptable Variants" error or fall back to the <directive
- module="mod_negotiation">LanguagePriority</directive>, the server
- will ignore the subset specification and match <code>en-GB</code>
- against <code>en</code> documents. Implicitly, Apache will add
- the parent language to the client's acceptable language list with
- a very low quality value. But note that if the client requests
- "en-GB; q=0.9, fr; q=0.8", and the server has documents
- designated "en" and "fr", then the "fr" document will be returned.
- This is necessary to maintain compliance with the HTTP/1.1
- specification and to work effectively with properly configured
- clients.</p>
-
- <p>In order to support advanced techniques (such as cookies or
- special URL-paths) to determine the user's preferred language,
- since Apache 2.0.47 <module>mod_negotiation</module> recognizes
- the <a href="env.html">environment variable</a>
- <code>prefer-language</code>. If it exists and contains an
- appropriate language tag, <module>mod_negotiation</module> will
- try to select a matching variant. If there's no such variant,
- the normal negotiation process applies.</p>
-
- <example><title>Example</title>
- SetEnvIf Cookie "language=en" prefer-language=en<br />
- SetEnvIf Cookie "language=fr" prefer-language=fr
- </example>
- </section>
- </section>
-
- <section id="extensions"><title>Extensions to Transparent Content
- Negotiation</title>
-
- <p>Apache extends the transparent content negotiation protocol (RFC
- 2295) as follows. A new <code>{encoding ..}</code> element is used in
- variant lists to label variants which are available with a specific
- content-encoding only. The implementation of the RVSA/1.0 algorithm
- (RFC 2296) is extended to recognize encoded variants in the list, and
- to use them as candidate variants whenever their encodings are
- acceptable according to the <code>Accept-Encoding</code> request
- header. The RVSA/1.0 implementation does not round computed quality
- factors to 5 decimal places before choosing the best variant.</p>
- </section>
-
- <section id="naming"><title>Note on hyperlinks and naming conventions</title>
-
- <p>If you are using language negotiation you can choose between
- different naming conventions, because files can have more than
- one extension, and the order of the extensions is normally
- irrelevant (see the <a
- href="mod/mod_mime.html#multipleext">mod_mime</a> documentation
- for details).</p>
-
- <p>A typical file has a MIME-type extension (<em>e.g.</em>,
- <code>html</code>), maybe an encoding extension (<em>e.g.</em>,
- <code>gz</code>), and of course a language extension
- (<em>e.g.</em>, <code>en</code>) when we have different
- language variants of this file.</p>
-
- <p>Examples:</p>
-
- <ul>
- <li>foo.en.html</li>
-
- <li>foo.html.en</li>
-
- <li>foo.en.html.gz</li>
- </ul>
-
- <p>Here are some more examples of filenames together with valid and
- invalid hyperlinks:</p>
-
- <table border="1" cellpadding="8" cellspacing="0">
- <columnspec><column width=".2"/><column width=".2"/>
- <column width=".2"/></columnspec>
- <tr>
- <th>Filename</th>
-
- <th>Valid hyperlink</th>
-
- <th>Invalid hyperlink</th>
- </tr>
-
- <tr>
- <td><em>foo.html.en</em></td>
-
- <td>foo<br />
- foo.html</td>
-
- <td>-</td>
- </tr>
-
- <tr>
- <td><em>foo.en.html</em></td>
-
- <td>foo</td>
-
- <td>foo.html</td>
- </tr>
-
- <tr>
- <td><em>foo.html.en.gz</em></td>
-
- <td>foo<br />
- foo.html</td>
-
- <td>foo.gz<br />
- foo.html.gz</td>
- </tr>
-
- <tr>
- <td><em>foo.en.html.gz</em></td>
-
- <td>foo</td>
-
- <td>foo.html<br />
- foo.html.gz<br />
- foo.gz</td>
- </tr>
-
- <tr>
- <td><em>foo.gz.html.en</em></td>
-
- <td>foo<br />
- foo.gz<br />
- foo.gz.html</td>
-
- <td>foo.html</td>
- </tr>
-
- <tr>
- <td><em>foo.html.gz.en</em></td>
-
- <td>foo<br />
- foo.html<br />
- foo.html.gz</td>
-
- <td>foo.gz</td>
- </tr>
- </table>
-
- <p>Looking at the table above, you will notice that it is always
- possible to use the name without any extensions in a hyperlink
- (<em>e.g.</em>, <code>foo</code>). The advantage is that you
- can hide the actual type of a document or file and can change
- it later, <em>e.g.</em>, from <code>html</code> to
- <code>shtml</code> or <code>cgi</code> without changing any
- hyperlink references.</p>
-
- <p>If you want to continue to use a MIME-type in your
- hyperlinks (<em>e.g.</em>, <code>foo.html</code>), the language
- extension (including an encoding extension if there is one)
- must be on the right-hand side of the MIME-type extension
- (<em>e.g.</em>, <code>foo.html.en</code>).</p>
- </section>
-
- <section id="caching"><title>Note on Caching</title>
-
- <p>When a cache stores a representation, it associates it with
- the request URL. The next time that URL is requested, the cache
- can use the stored representation. But, if the resource is
- negotiable at the server, this might result in only the first
- requested variant being cached and subsequent cache hits might
- return the wrong response. To prevent this, Apache normally
- marks all responses that are returned after content negotiation
- as non-cacheable by HTTP/1.0 clients. Apache also supports the
- HTTP/1.1 protocol features to allow caching of negotiated
- responses.</p>
-
- <p>For requests which come from an HTTP/1.0-compliant client
- (either a browser or a cache), the directive <directive
- module="mod_negotiation">CacheNegotiatedDocs</directive> can be
- used to allow caching of responses which were subject to
- negotiation. This directive can be given in the server config or
- virtual host and, as of Apache 2.0, takes an <code>On</code> or
- <code>Off</code> argument (earlier versions took no argument). It
- has no effect on requests from HTTP/1.1 clients.</p>
- </section>
-
- <section id="more"><title>More Information</title>
-
- <p>For more information about content negotiation, see Alan
- J. Flavell's <a
- href="http://ppewww.ph.gla.ac.uk/~flavell/www/lang-neg.html">Language
- Negotiation Notes</a>. But note that this document may not be
- updated to include changes in Apache 2.0.</p>
- </section>
-
- </manualpage>
-